Automatic Word Sense Disambiguation (WSD) System

Authors

  • Kevin Indrebo
  • Jidong Tao
  • Marek Trawicki
Abstract

This paper presents an automatic word sense disambiguation (WSD) system that uses part-of-speech (POS) tags and word classes as discrete features. Word classes are derived from the Word Class Assigner, which uses the Word Exchange Algorithm from statistical language processing. The Naïve Bayes classifier from Weka is used in both the training and testing phases to perform supervised learning on the standard Senseval-3 data set. Experiments were performed in two settings: 10-fold cross-validation on the training set, and training the model on the training data and evaluating it on the testing data. In both experiments, the features were either used separately or combined to produce the accuracies. Results indicate that word class features did not provide any discrimination for word sense disambiguation. POS tag features produced a small improvement over the baseline. Combining word class and POS tag features did not increase accuracy. Overall, further study is likely needed to improve the implementation of the word class features in the system.
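The classification step described in the abstract can be illustrated with a minimal sketch of a Naïve Bayes classifier over discrete features. This is not the authors' Weka setup: the feature names (`pos_prev`, `class_next`), the word class identifiers (`C2`, `C7`), the sense labels, and the toy examples are all hypothetical, and add-one smoothing is an assumption made here so that unseen feature values do not zero out a sense.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(examples):
    """Count senses and per-sense feature values.

    examples: list of (feature_dict, sense_label) pairs with discrete values.
    """
    sense_counts = Counter()
    value_counts = defaultdict(Counter)  # (sense, feature name) -> value counts
    vocab = defaultdict(set)             # feature name -> values seen in training
    for features, sense in examples:
        sense_counts[sense] += 1
        for name, value in features.items():
            value_counts[(sense, name)][value] += 1
            vocab[name].add(value)
    return sense_counts, value_counts, vocab


def predict(model, features):
    """Return the sense with the highest smoothed log posterior."""
    sense_counts, value_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, count in sense_counts.items():
        score = math.log(count / total)  # log prior of the sense
        for name, value in features.items():
            seen = value_counts[(sense, name)]
            # Add-one smoothing (an assumption of this sketch); the extra +1
            # in the denominator reserves mass for values never seen in training.
            numerator = seen[value] + 1
            denominator = sum(seen.values()) + len(vocab[name]) + 1
            score += math.log(numerator / denominator)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical toy data for the target word "bank": one POS feature and one
# word class feature per example, combined in a single feature dictionary.
TOY_DATA = [
    ({"pos_prev": "DT", "class_next": "C7"}, "bank/finance"),
    ({"pos_prev": "DT", "class_next": "C7"}, "bank/finance"),
    ({"pos_prev": "JJ", "class_next": "C7"}, "bank/finance"),
    ({"pos_prev": "IN", "class_next": "C2"}, "bank/river"),
    ({"pos_prev": "IN", "class_next": "C2"}, "bank/river"),
]

model = train_naive_bayes(TOY_DATA)
print(predict(model, {"pos_prev": "IN", "class_next": "C2"}))
```

Using the features separately, as in the experiments the abstract mentions, would correspond to passing dictionaries that contain only the POS keys or only the word class keys; combining them means merging both sets of keys into one dictionary.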


Similar Resources

CITYU-HIF: WSD with Human-Informed Feature Preference

This paper describes our word sense disambiguation (WSD) system participating in the SemEval-2007 tasks. The core system is a fully supervised system based on a Naïve Bayes classifier using multiple knowledge sources. Toward a larger goal of incorporating the intrinsic nature of individual target words in disambiguation, thus introducing a cognitive element in automatic WSD, we tried to fine-tu...


Different Sense Granularities For Different Applications

This paper describes a hierarchical approach to WordNet sense distinctions that provides different types of automatic Word Sense Disambiguation (WSD) systems, which perform at varying levels of accuracy. For tasks where fine-grained sense distinctions may not be essential, an accurate coarse-grained WSD system may be sufficient. The paper discusses the criteria behind the three different level...


Subcategorization Acquisition as an Evaluation Method for WSD

Evaluation of word sense disambiguation (WSD) systems is often based on machine-readable dictionaries (MRDs). Such evaluation typically employs a set of fine-grained dictionary senses and considers them all to be equally important. In this paper, we propose a novel evaluation method for WSD systems in the context of automatic subcategorization acquisition. Building on an extant subcategorizatio...


Word Sense Disambiguation: A Case Study on the Granularity of Sense Distinctions

The paper presents a method for word sense disambiguation (WSD) based on parallel corpora. The method exploits recent advances in word alignment and word clustering based on automatic extraction of translation equivalents and is supported by a lexical ontology made of aligned wordnets for the languages in the corpora. The wordnets are aligned to the Princeton Wordnet, according to the principle...


Improving Statistical Machine Translation Using Word Sense Disambiguation

We show for the first time that incorporating the predictions of a word sense disambiguation system within a typical phrase-based statistical machine translation (SMT) model consistently improves translation quality across all three different IWSLT Chinese-English test sets, as well as producing statistically significant improvements on the larger NIST Chinese-English MT task, and moreover never...


COLEPL and COLSLM: An Unsupervised WSD Approach to Multilingual Lexical Substitution, Tasks 2 and 3 SemEval 2010

In this paper, we present a word sense disambiguation (WSD) based system for multilingual lexical substitution. Our method depends on having a WSD system for English and an automatic word alignment method. Crucially the approach relies on having parallel corpora. For Task 2 (Sinha et al., 2009) we apply a supervised WSD system to derive the English word senses. For Task 3 (Lefever & Hoste, 2009...




Publication date: 2005